CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models
https://arxiv.org/abs/2010.00133
To measure some forms of social bias in language models against protected demographic groups in the US, we introduce the Crowdsourced Stereotype Pairs benchmark (CrowS-Pairs).
CrowS-Pairs has 1508 examples covering stereotypes across nine types of bias, including race, religion, and age.
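
As a rough illustration of how examples tagged with a bias type might be tallied, here is a minimal sketch. It assumes each example is a sentence pair carrying a single bias-type label. The field names (`sent_more`, `sent_less`, `bias_type`), the label strings, and the placeholder sentences are assumptions for illustration only; they are not real dataset entries and may not match the released file format.

```python
from collections import Counter

# Hypothetical toy rows: placeholder text only, NOT taken from the dataset.
# Field names and label strings are assumed for illustration.
toy_examples = [
    {"sent_more": "<sentence A1>", "sent_less": "<sentence B1>", "bias_type": "race"},
    {"sent_more": "<sentence A2>", "sent_less": "<sentence B2>", "bias_type": "religion"},
    {"sent_more": "<sentence A3>", "sent_less": "<sentence B3>", "bias_type": "age"},
    {"sent_more": "<sentence A4>", "sent_less": "<sentence B4>", "bias_type": "race"},
]


def count_by_bias_type(examples):
    """Return a Counter mapping each bias-type label to its number of examples."""
    return Counter(example["bias_type"] for example in examples)


counts = count_by_bias_type(toy_examples)
for bias_type, n in sorted(counts.items()):
    print(f"{bias_type}: {n}")
```

Run over the full benchmark, a tally like this would be expected to sum to 1508 across nine labels, given the figures stated above.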